Abstract: In this paper, we study the mixing time of Markov
Chain Monte Carlo (MCMC) for integer least-square (LS)
optimization problems. It is found that the mixing time of
MCMC for integer LS problems depends on the structure of the
underlying lattice. More specifically, the mixing time of MCMC
is closely related to whether there is a local minimum in the
lattice structure. For some lattices, the mixing time of the Markov
chain is independent of the signal-to-noise ratio (SNR) and
grows polynomially in the problem dimension, while for some
lattices, the mixing time grows unboundedly as SNR grows.
Both theoretical and empirical results suggest that to ensure fast
mixing, the temperature for MCMC should often grow
as the SNR increases. We also derive the probability that there
exist local minima in an integer least-square problem, which can
be as high as $\frac{1}{3} - \frac{1}{\sqrt{5}} + \frac{2\arctan(\sqrt{5/3})}{\sqrt{5}\pi}$.

I. Introduction

The integer least-square problem is an NP-hard optimization
problem which has received attention in many research areas,
for example, communications, global navigation satellite systems,
radar imaging, Monte Carlo second-moment estimation,
bioinformatics and lattice design [1], [2]. A computationally
efficient way of exactly solving the integer LS problem is
the sphere decoder (SD) [1], [3]-[5]. It is known that for
a moderate problem size and a suitable range of signal-to-noise
ratios (SNR), SD has low computational complexity,
which can be significantly smaller than that of an exhaustive search
solver. But for a large problem size and fixed SNR, the
average computational complexity of SD is still exponential
in the problem dimension [6]. So for large problem sizes (for
example, large-scale Multiple-Input Multiple-Output (MIMO)
systems with many transmit and receive antennas), SD still has
high computational complexity and is thus computationally
infeasible.

Unlike SD, MCMC algorithms perform a random walk over
the signal space in the hope of finding the optimal solution.
Gibbs sampling (or Glauber dynamics) is a popular MCMC
method which performs the random walk according to the
transition probability determined by the stationary distribution
of a reversible Markov chain [7], [8]. The Gibbs sampler has
been proposed for detection purposes in wireless communication
[9], [10] (see also the references therein). These MCMC
methods are able to provide the optimal solution if they
are run for a sufficiently long time; and empirically MCMC
methods are observed to provide near-optimal solutions in
a reasonable amount of computational time even for large
problem dimensions [9]-[11]. [11] gave a characterization
of the MCMC temperature parameter such that the optimal
solution can be found in polynomial time assuming the stationary
distribution has been reached. However, the understanding of
the mixing time (or the convergence rate, namely how fast a
Markov chain converges to the stationary distribution) of these
MCMC methods is still limited [11]-[13].

In this paper, we are interested in deriving the mixing time 
of the Gibbs sampler for integer LS problems. We derive 
upper and lower bounds on the mixing time and show how the 
mixing time is related to the structures of integer LS problems. 
Our work furthers the understanding of the mixing time in 
MCMC for integer LS problems, and is helpful in optimizing 
the MCMC parameter for better computational performance. 

Our paper is organized as follows. In Section II we present
the system model. The MCMC method and related background
knowledge are introduced in Section III. Sections IV, V, VI and
VII derive the bounds on the mixing time and discuss how to
optimize MCMC parameters to ensure fast mixing. Simulation
results are given in Section VIII. Section IX concludes this
paper.

II. System Model 

In this paper, we consider a real-valued integer least-square
problem with $N$ transmit and $N$ receive dimensions, targeting
applications in block-fading MIMO antenna systems with
known channel coefficients. The received signal $\mathbf{y} \in \mathbb{R}^N$ can
be expressed as

$$\mathbf{y} = \sqrt{\frac{\mathrm{SNR}}{N}}\,\mathbf{H}\mathbf{x} + \mathbf{v}, \qquad (1)$$

where $\mathbf{x} \in \Omega^N$ is the transmitted signal, and $\Omega$ denotes the
constellation set. To simplify the derivations in the paper we
will assume that $\Omega = \{\pm 1\}$. $\mathbf{v} \in \mathbb{R}^N$ is the noise vector where
each entry is Gaussian $\mathcal{N}(0, 1)$ and independent identically
distributed (i.i.d.), and $\mathbf{H} \in \mathbb{R}^{N \times N}$ denotes the channel matrix
with i.i.d. $\mathcal{N}(0, 1)$ entries. The signal-to-noise ratio is defined
as

$$\mathrm{SNR} = \frac{E\left\|\sqrt{\frac{\mathrm{SNR}}{N}}\,\mathbf{H}\mathbf{x}\right\|^2}{E\|\mathbf{v}\|^2}, \qquad (2)$$

which is done in order to take into account the total transmit
energy. Without loss of generality, we assume that the all minus
one vector was transmitted, $\mathbf{x} = -\mathbf{1}$. Therefore

$$\mathbf{y} = \mathbf{v} - \sqrt{\frac{\mathrm{SNR}}{N}}\,\mathbf{H}\mathbf{1}. \qquad (3)$$

To minimize the average error probability, we need to perform
Maximum Likelihood Sequence Detection (here simply
referred to as ML detection) given by

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \Omega^N} \left\|\mathbf{y} - \sqrt{\frac{\mathrm{SNR}}{N}}\,\mathbf{H}\mathbf{x}\right\|^2, \qquad (4)$$

which is exactly an integer LS problem.
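
As a small illustration of the system model and the ML detection problem, the following Python sketch draws one problem instance and solves it by exhaustive search. It is not part of the original paper: the function names (`make_instance`, `cost`, `ml_detect`), the seed, and the values N = 4 and SNR = 10 are assumptions chosen for illustration, and the exhaustive search is only feasible for small N.

```python
import itertools
import math
import random


def make_instance(n, snr, rng):
    """Draw H and v with i.i.d. N(0, 1) entries and form y for x = -1."""
    H = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(snr / n)
    # y = sqrt(SNR/N) * H * (-1) + v, i.e. the all minus one vector is sent
    y = [v[r] - scale * sum(H[r]) for r in range(n)]
    return H, y, scale


def cost(H, y, scale, x):
    """Squared residual ||y - scale * H x||^2."""
    return sum(
        (y[r] - scale * sum(h * xi for h, xi in zip(H[r], x))) ** 2
        for r in range(len(y))
    )


def ml_detect(H, y, scale):
    """Exhaustive ML detection over {-1, +1}^N (2^N candidates)."""
    n = len(H[0])
    return min(itertools.product((-1, 1), repeat=n),
               key=lambda x: cost(H, y, scale, x))


rng = random.Random(0)          # illustrative seed
H, y, scale = make_instance(4, 10.0, rng)
x_hat = ml_detect(H, y, scale)
```

The search visits all 2^N candidates, which is the exponential cost that motivates the sphere decoder and the MCMC methods discussed in the paper.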

III. Gibbs Sampling and Mixing Time

In this paper, we investigate one kind of MCMC detector
called the Gibbs sampler, which follows a reversible Markov chain
and asymptotically converges to the stationary distribution
[14]. Under the stationary distribution, the Gibbs sampler has
a certain probability of visiting the optimal solution. So if run
for a sufficiently long time, the Gibbs sampler will be able to
find the optimal solution to (4).

More specifically, the Gibbs sampler starts with a certain
$N$-dimensional feasible vector $\mathbf{x}^{(0)}$ among the set $\{-1, +1\}^N$
of cardinality $2^N$. Then the Gibbs sampler performs a random
walk over $\{-1,+1\}^N$ based on the following reversible
Markov chain. Assume that we are at time index $l$ and the
current state of the Markov chain is $\mathbf{x}^{(l)} \in \{-1,+1\}^N$. In
the next step, the Markov chain picks one position index $j$
uniformly at random out of $\{1, 2, \ldots, N\}$ and keeps the symbols of
$\mathbf{x}^{(l)}$ at the other positions fixed. Then the Gibbs sampler computes
the conditional probability of transferring to each constellation
point at the $j$-th index. With the symbols at the $(N - 1)$ other
positions fixed, the probability that the $j$-th symbol adopts the
value $\omega$ is given by

$$p\left(x_j^{(l+1)} = \omega \mid \theta\right) = \frac{e^{-\frac{1}{2\alpha^2}\left\|\mathbf{y} - \sqrt{\frac{\mathrm{SNR}}{N}}\mathbf{H}\mathbf{x}_{j|\omega}\right\|^2}}{\sum_{\tilde{\omega} \in \Omega} e^{-\frac{1}{2\alpha^2}\left\|\mathbf{y} - \sqrt{\frac{\mathrm{SNR}}{N}}\mathbf{H}\mathbf{x}_{j|\tilde{\omega}}\right\|^2}}, \qquad (5)$$

where $\mathbf{x}_{j|\omega}$ denotes the vector $\mathbf{x}^{(l)}$ with its $j$-th entry replaced by $\omega$,
and $\theta = \{\mathbf{x}^{(l)}, j, \mathbf{y}, \mathbf{H}\}$. So, conditioned on the $j$-th position being chosen, the Gibbs sampler
will with probability $p(x_j^{(l+1)} = \omega \mid \theta)$ keep $\omega$ at the $j$-th index
in the estimated symbol vector. The initialization of the symbol
vector $\mathbf{x}^{(0)}$ can either be chosen randomly or taken from other heuristic
solutions. $\alpha$ represents a tunable positive parameter which
controls the mixing time of the Markov chain; this parameter is
also sometimes called the "temperature". The smaller $\alpha$ is, the
larger the stationary probability for the optimal solution will
be, and the easier it is for the Gibbs sampler to find the optimal
solution in the stationary distribution. But as we will show
in the paper, there is often a lower bound on $\alpha$, in order to
ensure the fast mixing of the Markov chain to the stationary
distribution.
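
The update rule described above can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the SNR factor is assumed to be absorbed into `H`, the name `gibbs_detect`, the overflow guard at 700, the seed, and the diagonal test matrix are illustrative assumptions, and the conditional probability for $\Omega = \{\pm 1\}$ is written in its equivalent logistic form.

```python
import math
import random


def cost(H, y, x):
    """Squared residual ||y - H x||^2 (SNR factor absorbed into H)."""
    return sum((y[r] - sum(h * xi for h, xi in zip(H[r], x))) ** 2
               for r in range(len(y)))


def gibbs_detect(H, y, alpha, n_steps, rng):
    """Random-scan Gibbs sampler; returns the best state visited and its cost."""
    n = len(H[0])
    x = [rng.choice((-1, 1)) for _ in range(n)]      # random initialization
    best, best_cost = list(x), cost(H, y, x)
    for _ in range(n_steps):
        j = rng.randrange(n)                         # uniformly chosen position
        x[j] = 1
        c_plus = cost(H, y, x)
        x[j] = -1
        c_minus = cost(H, y, x)
        # P(x_j = +1 | others) = 1 / (1 + exp((c_plus - c_minus) / (2 alpha^2)))
        z = (c_plus - c_minus) / (2.0 * alpha ** 2)
        p_plus = 0.0 if z > 700.0 else 1.0 / (1.0 + math.exp(z))
        if rng.random() < p_plus:
            x[j] = 1
        c = c_plus if x[j] == 1 else c_minus
        if c < best_cost:
            best, best_cost = list(x), c
    return best, best_cost


# Illustrative noise-free instance with orthogonal columns, x = -1 transmitted.
H = [[3.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
y = [-3.0] * 4
best, best_cost = gibbs_detect(H, y, alpha=1.0, n_steps=200, rng=random.Random(1))
```

Tracking the best state visited, rather than returning the final state, reflects how the sampler is used as a detector: the chain only needs to visit the optimal solution once.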

It is not hard to see that the Markov chain of the Gibbs sampler
is reversible and has $2^N$ states with the stationary distribution

$$\pi(\mathbf{x}) = \frac{e^{-\frac{1}{2\alpha^2}\left\|\mathbf{y} - \sqrt{\frac{\mathrm{SNR}}{N}}\mathbf{H}\mathbf{x}\right\|^2}}{\sum_{\tilde{\mathbf{x}} \in \{-1,+1\}^N} e^{-\frac{1}{2\alpha^2}\left\|\mathbf{y} - \sqrt{\frac{\mathrm{SNR}}{N}}\mathbf{H}\tilde{\mathbf{x}}\right\|^2}}$$

for a state $\mathbf{x}$. The $2^N \times 2^N$ transition
matrix is denoted by $P$, and the element $P_{i,j}$ in the $i$-th
($1 \le i \le 2^N$) row and $j$-th ($1 \le j \le 2^N$) column is the probability
of transferring to state $j$ conditioned on the previous state being
$i$. So each row of $P$ sums up to 1 and the transition matrix
after $t$ iterations is $P^t$. Denoting the vector for the stationary
distribution as $\pi$, then for an $\epsilon > 0$, the mixing time $t_{mix}(\epsilon)$ is a
parameter describing how long it takes for the Markov chain
to get close to the stationary distribution, namely,

$$t_{mix}(\epsilon) := \min\left\{t : \max_{\mathbf{x}} \|P^t(\mathbf{x},\cdot) - \pi\|_{TV} \le \epsilon\right\},$$

where $\|\mu - \nu\|_{TV}$ is the usual total variation distance between
two distributions $\mu$ and $\nu$ over the state space $\{+1,-1\}^N$,
namely $\|\mu - \nu\|_{TV} = \frac{1}{2}\sum_{\mathbf{x}} |\mu(\mathbf{x}) - \nu(\mathbf{x})|$.

The mixing time is closely related to the spectrum of the
transition matrix $P$. More precisely, for a reversible Markov
chain, its mixing time is generally small when the gap between
the largest and the second largest eigenvalue of $P$, namely
$1 - \lambda_2$, is large. The inverse of this gap, $\frac{1}{1-\lambda_2}$, is called the
relaxation time for this Markov chain. In the next few sections,
we will discuss how the mixing time is related to specific
system structures.
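
For small $N$ the definitions above can be evaluated exactly. The following Python sketch, which is not from the paper, builds the stationary distribution and the transition matrix of the random-scan Gibbs sampler and finds the mixing time by repeated matrix multiplication. The names `gibbs_chain` and `mixing_time`, the cap `t_max`, and the 3 x 3 diagonal example are assumptions made for illustration, with the SNR factor absorbed into `H`.

```python
import itertools
import math


def cost(H, y, x):
    """Squared residual ||y - H x||^2 (SNR factor absorbed into H)."""
    return sum((y[r] - sum(h * xi for h, xi in zip(H[r], x))) ** 2
               for r in range(len(y)))


def gibbs_chain(H, y, alpha):
    """Return (states, pi, P) for the random-scan Gibbs sampler."""
    n = len(H[0])
    states = list(itertools.product((-1, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    w = [math.exp(-cost(H, y, s) / (2.0 * alpha ** 2)) for s in states]
    total = sum(w)
    pi = [wi / total for wi in w]
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        i = index[s]
        for j in range(n):
            k = index[s[:j] + (-s[j],) + s[j + 1:]]   # neighbor with x_j flipped
            move = w[k] / (w[i] + w[k]) / n           # pick j, then resample x_j
            P[i][k] += move
            P[i][i] += 1.0 / n - move
    return states, pi, P


def mixing_time(pi, P, eps, t_max=10000):
    """Smallest t with max_x ||P^t(x, .) - pi||_TV <= eps, or None."""
    m = len(pi)
    Pt = [row[:] for row in P]
    for t in range(1, t_max + 1):
        d = max(0.5 * sum(abs(Pt[i][k] - pi[k]) for k in range(m))
                for i in range(m))
        if d <= eps:
            return t
        Pt = [[sum(Pt[i][l] * P[l][k] for l in range(m)) for k in range(m)]
              for i in range(m)]
    return None


# Illustrative 3 x 3 instance with orthogonal columns and no noise.
H = [[2.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
y = [-2.0, -2.0, -2.0]
states, pi, P = gibbs_chain(H, y, alpha=1.0)
t = mixing_time(pi, P, eps=0.25)
```

The matrix has $2^N$ rows, so this exact computation is only practical for very small $N$; it is meant as a check on the definitions, not as a detection method.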

IV. Mixing Time without Local Minima 

In this section, we consider the mixing time for MCMC for 
integer LS problems and study how the mixing time for integer 
LS problem depends on the linear matrix structure and SNR. 
As a first step, we consider a linear matrix H with orthogonal 
columns. As shown later, the mixing time for this matrix has 
an upper bound independent of SNR. In fact, this is a general 
phenomenon for integer LS problems without local minima. 

For simplicity, we incorporate the SNR term into $\mathbf{H}$, and
the model we are currently considering is

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{v}, \qquad (6)$$

where the columns of $\mathbf{H}$ are orthogonal to each other. We
will also incorporate the SNR term into $\mathbf{H}$ this way in the
following sections unless stated otherwise.

Theorem IV.1. Independent of the temperature $\alpha$ and SNR,
the mixing time of the Gibbs sampler for orthogonal-column
integer least-square problems is upper bounded by $N\log(N) + \log(1/\epsilon)N$.

This theorem is an extension of the mixing time for regular
random walks on an $N$-dimensional hypercube [7]. The only
difference here is that the transition probability follows (5) and
that the transition probability depends on SNR.

Proof: When the $k$-th index is selected for update in the
Gibbs sampler, since the columns of $\mathbf{H}$ are orthogonal to each
other, the probability of updating $x_k$ to $-1$ is $\frac{1}{1 + e^{2\mathbf{y}^T\mathbf{h}_k/\alpha^2}}$,
where $\mathbf{h}_k$ denotes the $k$-th column of $\mathbf{H}$. We
note that this probability is independent of the current state of the
Markov chain $\mathbf{x}$. So we can use the classical coupling idea to
get an upper bound on the mixing time of this Markov chain.

Consider two separate Markov chains starting at two different
states $\mathbf{x}_1$ and $\mathbf{x}_2$. These two chains follow the same update
rule according to (5) and, by using the same random source, in each
step they select the same position index to update and they
update that position to the same symbol. Let $\tau_{couple}$ be the
first time the two chains come to the same state. Then by a
classical result, the total variation distance

$$d(t) = \max_{\mathbf{x}} \|P^t(\mathbf{x},\cdot) - \pi\|_{TV} \le \max_{\mathbf{x}_1,\mathbf{x}_2} P_{\mathbf{x}_1,\mathbf{x}_2}\{\tau_{couple} > t\}. \qquad (7)$$

Note that the coupling time is just the time for collecting all of the
positions where $\mathbf{x}_1$ and $\mathbf{x}_2$ differ, as in the coupon collector
problem. From the coupon collector results, for any $\mathbf{x}_1$ and
$\mathbf{x}_2$,

$$d(N\log(N) + cN) \le P_{\mathbf{x}_1,\mathbf{x}_2}\{\tau_{couple} > N\log(N) + cN\} \le e^{-c}. \qquad (8)$$

Taking $c = \log(1/\epsilon)$, the conclusion follows. ■
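
The coupon collector tail used in the coupling argument can be checked empirically. The Python sketch below is an illustration under stated assumptions, not part of the paper: it simulates the worst case in which the two chains differ in every position, and the names `coupling_time` and `tail_estimate`, the seed, and the values N = 10, c = 1 and 20000 trials are arbitrary choices.

```python
import math
import random


def coupling_time(n, rng):
    """Steps until every position index in {0, ..., n-1} has been selected."""
    seen = set()
    t = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        t += 1
    return t


def tail_estimate(n, c, trials, rng):
    """Empirical P(tau_couple > N log N + c N) when the chains differ everywhere."""
    threshold = n * math.log(n) + c * n
    return sum(coupling_time(n, rng) > threshold for _ in range(trials)) / trials


rng = random.Random(0)      # illustrative seed
n, c = 10, 1.0
tail = tail_estimate(n, c, 20000, rng)
bound = math.exp(-c)        # the e^{-c} bound from the coupon collector result
```

Comparing `tail` with `bound` gives a quick sanity check of the tail inequality; the simulation does not depend on $\alpha$ or SNR, which mirrors the statement of the theorem.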

V. Mixing Time with Local Minima

In this section, we consider the mixing time for integer
LS problems which have local minima besides the global
minimum point.

Definition V.1. A local minimum $\mathbf{x}$ is a state such that $\mathbf{x}$ is
not a global minimizer for $\min_{\mathbf{s} \in \{-1,+1\}^N} \|\mathbf{y} - \mathbf{H}\mathbf{s}\|^2$; and any
of its neighbors which differ from $\mathbf{x}$ in only one position index,
denoted by $\mathbf{x}'$, satisfies $\|\mathbf{y} - \mathbf{H}\mathbf{x}'\|^2 > \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2$.

We will use the following theorem about the spectral gap
of a Markov chain to evaluate the mixing time.

Theorem V.2 (Jerrum and Sinclair (1989) [15], Lawler and
Sokal (1988) [16], [7]). Let $\lambda_2$ be the second largest eigenvalue
of a reversible transition matrix $P$, and let $\gamma = 1 - \lambda_2$.
Then

$$\frac{\Phi_*^2}{2} \le \gamma \le 2\Phi_*,$$

where $\Phi_*$ is the bottleneck ratio defined as

$$\Phi_* = \min_{S:\ \pi(S) \le \frac{1}{2}} \frac{Q(S,S^c)}{\pi(S)}.$$

In particular, the parameter $\gamma$ is upper bounded by $2\,Q(S,S^c)/\pi(S)$ for any such set $S$.
Here $S$ is any subset of the state space with stationary
measure no bigger than $\frac{1}{2}$, $S^c$ is its complement set, and
$Q(S,S^c)$ is the probability of moving from $S$ to $S^c$ in one
step when starting with the stationary distribution.
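
For a very small problem, both sides of the bottleneck inequality can be evaluated numerically. The sketch below is an illustration and not part of the paper: it enumerates all subsets $S$ to obtain the bottleneck ratio and estimates $\lambda_2$ by power iteration on the symmetrized matrix, which assumes the chain is reversible with nonnegative eigenvalues (as is the case for a random-scan Gibbs sampler). The 3 x 3 matrix `H`, the iteration count, and all function names are assumptions.

```python
import itertools
import math


def cost(H, y, x):
    return sum((y[r] - sum(h * xi for h, xi in zip(H[r], x))) ** 2
               for r in range(len(y)))


def gibbs_chain(H, y, alpha):
    """Stationary distribution and transition matrix of the Gibbs sampler."""
    n = len(H[0])
    states = list(itertools.product((-1, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    w = [math.exp(-cost(H, y, s) / (2.0 * alpha ** 2)) for s in states]
    total = sum(w)
    pi = [wi / total for wi in w]
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        i = index[s]
        for j in range(n):
            k = index[s[:j] + (-s[j],) + s[j + 1:]]
            move = w[k] / (w[i] + w[k]) / n
            P[i][k] += move
            P[i][i] += 1.0 / n - move
    return pi, P


def bottleneck_ratio(pi, P):
    """min over S with pi(S) <= 1/2 of Q(S, S^c) / pi(S), by enumeration."""
    m = len(pi)
    best = float("inf")
    for mask in range(1, 2 ** m - 1):
        S = [i for i in range(m) if (mask >> i) & 1]
        mass = sum(pi[i] for i in S)
        if mass > 0.5:
            continue
        in_S = set(S)
        flow = sum(pi[i] * P[i][k] for i in S for k in range(m) if k not in in_S)
        best = min(best, flow / mass)
    return best


def spectral_gap(pi, P, iters=3000):
    """1 - lambda_2 via power iteration on A = D^{1/2} P D^{-1/2}."""
    m = len(pi)
    u = [math.sqrt(p) for p in pi]            # top eigenvector of A
    A = [[u[i] * P[i][k] / u[k] for k in range(m)] for i in range(m)]
    v = [float(i + 1) for i in range(m)]
    lam = 0.0
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(u, v))
        v = [vi - dot * ui for vi, ui in zip(v, u)]   # project out u
        norm = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm for vi in v]
        Av = [sum(A[i][k] * v[k] for k in range(m)) for i in range(m)]
        lam = sum(a * b for a, b in zip(v, Av))       # Rayleigh quotient
        v = Av
    return 1.0 - lam


H = [[1.0, 0.6, 0.0], [0.0, 0.8, 0.3], [0.2, 0.0, 1.0]]   # illustrative matrix
y = [-sum(row) for row in H]                              # x = -1, v = 0
pi, P = gibbs_chain(H, y, alpha=1.0)
phi = bottleneck_ratio(pi, P)
gap = spectral_gap(pi, P)
```

The enumeration over subsets grows doubly exponentially in $N$, so this check is limited to toy sizes; its purpose is only to make the two quantities in the theorem concrete.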

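Theorem V.3. If there is a local minimum $\mathbf{x}$ in an integer
least-square problem and we denote its neighbor differing only
at the $k$-th ($1 \le k \le N$) location as $\mathbf{x}^k$, then the mixing time
of the Gibbs sampler is at least

$$t_{mix}(\epsilon) \ge \left(\frac{1}{\gamma} - 1\right)\log\left(\frac{1}{2\epsilon}\right),$$

where the spectral gap $\gamma$ satisfies

$$\gamma \le \frac{2}{N}\sum_{k=1}^{N} \frac{1}{1 + e^{\left(\|\mathbf{y} - \mathbf{H}\mathbf{x}^k\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2\right)/(2\alpha^2)}}.$$

Proof: We apply Theorem V.2 to prove this result. We
take a local minimum point $\mathbf{x}$ as the single element in the
bottleneck set $S$. Since $\mathbf{x}$ is a local minimum, $\pi(S) \le \frac{1}{2}$.
Starting from $\mathbf{x}$, the chain selects each position $k$ with probability $\frac{1}{N}$
and then moves to $\mathbf{x}^k$ according to (5), so

$$\frac{Q(S,S^c)}{\pi(S)} = \frac{1}{N}\sum_{k=1}^{N} \frac{1}{1 + e^{\left(\|\mathbf{y} - \mathbf{H}\mathbf{x}^k\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2\right)/(2\alpha^2)}}.$$

So we know $\gamma \le 2\Phi_* \le \frac{2}{N}\sum_{k=1}^{N} \frac{1}{1 + e^{(\|\mathbf{y} - \mathbf{H}\mathbf{x}^k\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2)/(2\alpha^2)}}$.
From a well-known theorem for the relationship between $t_{mix}(\epsilon)$ and
$\gamma$, $t_{mix}(\epsilon) \ge \left(\frac{1}{\gamma} - 1\right)\log\left(\frac{1}{2\epsilon}\right)$ [7], our conclusion follows. ■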
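
The lower bound can be evaluated directly once a local minimum is known. The following Python sketch is an illustration under stated assumptions rather than the paper's code: it uses the singleton bottleneck set from the proof, with the bound on the spectral gap written as $\frac{2}{N}\sum_k 1/(1 + e^{d_k/(2\alpha^2)})$ where $d_k$ is the cost increase of the $k$-th neighbor. The 2 x 2 matrix, the state `x_loc`, the guard at 700, and the function names are hypothetical choices.

```python
import math


def cost(H, y, x):
    return sum((y[r] - sum(h * xi for h, xi in zip(H[r], x))) ** 2
               for r in range(len(y)))


def gap_upper_bound(H, y, x, alpha):
    """(2/N) * sum_k 1 / (1 + exp(d_k / (2 alpha^2))), d_k = cost(x^k) - cost(x)."""
    n = len(x)
    c0 = cost(H, y, x)
    total = 0.0
    for k in range(n):
        xk = list(x)
        xk[k] = -xk[k]                       # neighbor differing at position k
        z = (cost(H, y, xk) - c0) / (2.0 * alpha ** 2)
        total += 0.0 if z > 700.0 else 1.0 / (1.0 + math.exp(z))
    return 2.0 * total / n


def mixing_time_lower_bound(H, y, x, alpha, eps):
    """(1/gamma_bound - 1) * log(1 / (2 eps)); infinite if the bound underflows."""
    g = gap_upper_bound(H, y, x, alpha)
    if g == 0.0:
        return float("inf")
    return (1.0 / g - 1.0) * math.log(1.0 / (2.0 * eps))


# Illustrative instance: second column is -(1 + 0.1) times the first, v = 0.
H = [[1.0, -1.1], [0.0, 0.0]]
y = [-sum(row) for row in H]
x_loc = (1, 1)                               # a local minimum of this instance
lower_hot = mixing_time_lower_bound(H, y, x_loc, alpha=1.0, eps=0.25)
lower_cold = mixing_time_lower_bound(H, y, x_loc, alpha=0.5, eps=0.25)
```

Lowering the temperature makes every term in the sum smaller, so the lower bound on the mixing time grows, which is the behavior the theorem describes.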

Theorem V.4. For an integer least-square problem where no
two vectors give the same objective distance, the relaxation
time (the inverse of the spectral gap) of MCMC is upper
bounded by a constant as the temperature $\alpha \to 0$ if and
only if there is no local minimum. Moreover, when there is
a local minimum, as $\alpha \to 0$, the mixing time of the Markov chain
satisfies $t_{mix}(\epsilon) = e^{\Omega\left(\frac{1}{2\alpha^2}\right)}$.
(In this paper, $\Omega(\cdot)$, $\Theta(\cdot)$, and $O(\cdot)$ are the usual scaling notations as in
computer science.)

Proof: First, when there is a local minimum $\mathbf{x}$, from Theorem V.3
and Theorem V.2 the spectral gap $\gamma$ is upper bounded
by

$$\gamma \le \frac{2}{N}\sum_{k=1}^{N} \frac{1}{1 + e^{\left(\|\mathbf{y} - \mathbf{H}\mathbf{x}^k\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2\right)/(2\alpha^2)}}.$$

As the temperature $\alpha \to 0$, this upper bound on the spectral gap
decreases at the speed of $\Theta\left(e^{-\min_k\left(\|\mathbf{y} - \mathbf{H}\mathbf{x}^k\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2\right)/(2\alpha^2)}\right)$.
So the relaxation time of the MCMC is lower bounded by
$e^{\Omega\left(\frac{1}{2\alpha^2}\right)}$, which grows unboundedly as $\alpha \to 0$, and so does the mixing time $t_{mix}(\epsilon)$.

Suppose instead that there is no local minimum. We argue
that as $\alpha \to 0$, the spectral gap of this MCMC is lower bounded
by some constant independent of $\alpha$. Again, we look at the
bottleneck ratio and use Theorem V.2 to bound the spectral
gap.

Consider any set $S$ of sequences which does not include the
global minimum point $\mathbf{x}^*$. As $\alpha \to 0$, the measure of this
set of sequences satisfies $\pi(S) \le \frac{1}{2}$. Moreover, as $\alpha \to 0$, any set $S$
with $\pi(S) \le \frac{1}{2}$ cannot contain the global minimum point
$\mathbf{x}^*$. Now we look at the sequence $\mathbf{x}'$ which has the smallest
distance $\|\mathbf{y} - \mathbf{H}\mathbf{x}'\|$ among the set $S$. Since there is no local
minimum, $\mathbf{x}'$ must have at least one neighbor $\mathbf{x}''$ in $S^c$ which
has smaller distance than $\mathbf{x}'$. Otherwise, this would imply $\mathbf{x}'$
is a local minimum. So

$$\frac{Q(S,S^c)}{\pi(S)} \ge \frac{\pi(\mathbf{x}')P(\mathbf{x}',\mathbf{x}'')}{\pi(S)},$$

which approaches $\frac{1}{N}$ as $\alpha \to 0$, because $\pi(\mathbf{x}')/\pi(S) \to 1$ and $P(\mathbf{x}',\mathbf{x}'') \to \frac{1}{N}$.
From Theorem V.2 the spectral gap $\gamma$ is at least $\frac{\Phi_*^2}{2}$,
which is lower bounded by a constant as $\alpha \to 0$. ■

So from the analysis above, the mixing time is closely 
related to whether there are local minima in the problem. In 
the next section, we will see there often exist local minima, 
which implies very slow convergence rate for MCMC when 
the temperature is kept at the noise level in the high SNR 
regime. 

VI. The Presence of Local Minima 

In this section, we look at the problem of how many local 
minima there are in an integer least-square problem, especially 
when the SNR is high. 

Theorem VI.1. There can be exponentially many local minima
in an integer least-square problem.

Proof: Let $N$ be an even integer. Consider a matrix
whose first $\frac{N}{2}$ columns $\mathbf{h}_i$, $1 \le i \le \frac{N}{2}$, have unit norms and
are orthogonal to each other. For the other $\frac{N}{2}$ columns $\mathbf{h}_i$,
$\frac{N}{2} + 1 \le i \le N$, $\mathbf{h}_i = -(1 + \epsilon)\mathbf{h}_{i-\frac{N}{2}}$, where $\epsilon$ is a sufficiently
small positive number ($\epsilon < 1$). We also let $\mathbf{y} = \mathbf{H}(-\mathbf{1})$, where
$\mathbf{1}$ is an all-1 vector. So $-\mathbf{1}$ is a globally minimum point for
this integer LS problem.

Consider all those vectors $\mathbf{x}'$ for which, for any $1 \le i \le \frac{N}{2}$, the
$i$-th element and the $(i + \frac{N}{2})$-th element are either simultaneously
$+1$ or simultaneously $-1$. When $\epsilon$ is smaller than 1, we claim
that any such vector, except the all $-1$ vector, is a local
minimum, which shows that there are at least $2^{N/2} - 1$ local
minima.

Assume that for a certain $1 \le i \le \frac{N}{2}$, the $i$-th element and
$(i + \frac{N}{2})$-th element of $\mathbf{x}'$ are simultaneously $-1$. Then if we
change the $i$-th element to $+1$, $\|\mathbf{y} - \mathbf{H}\mathbf{x}'\|^2$ increases by 4;
and if we change the $(i + \frac{N}{2})$-th element to $+1$, $\|\mathbf{y} - \mathbf{H}\mathbf{x}'\|^2$
increases by $4(1+\epsilon)^2$. This is true because the $i$-th and $(i + \frac{N}{2})$-th
columns are orthogonal to the other $(N - 2)$ columns.

Similarly, assume that for a certain $1 \le i \le \frac{N}{2}$, the $i$-th
element and $(i + \frac{N}{2})$-th element of $\mathbf{x}'$ are simultaneously $+1$.
Then if we change the $i$-th element to $-1$, $\|\mathbf{y} - \mathbf{H}\mathbf{x}'\|^2$ increases
by $4(1 + \epsilon)^2 - 4\epsilon^2$; and if we change the $(i + \frac{N}{2})$-th element
to $-1$, $\|\mathbf{y} - \mathbf{H}\mathbf{x}'\|^2$ increases by $4 - 4\epsilon^2$. ■
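
The construction in the proof can be verified by exhaustive search for a small even $N$. The Python sketch below is an illustration, not the authors' code: it takes the first $N/2$ columns to be standard basis vectors (one possible choice of orthonormal columns), and the names `paired_matrix` and `local_minima` as well as the values N = 6 and $\epsilon = 0.1$ are assumptions.

```python
import itertools


def paired_matrix(n, eps):
    """Columns h_i = e_i for i < n/2 and h_{i + n/2} = -(1 + eps) * e_i."""
    half = n // 2
    H = [[0.0] * n for _ in range(n)]
    for i in range(half):
        H[i][i] = 1.0
        H[i][i + half] = -(1.0 + eps)
    return H


def cost(H, y, x):
    return sum((y[r] - sum(h * xi for h, xi in zip(H[r], x))) ** 2
               for r in range(len(y)))


def local_minima(H, y):
    """All states that are not global minimizers but beat every neighbor."""
    n = len(H[0])
    states = list(itertools.product((-1, 1), repeat=n))
    c = {s: cost(H, y, s) for s in states}
    c_min = min(c.values())
    result = []
    for s in states:
        if c[s] <= c_min:
            continue                      # global minimizer, excluded
        neighbors = (s[:k] + (-s[k],) + s[k + 1:] for k in range(n))
        if all(c[t] > c[s] for t in neighbors):
            result.append(s)
    return result


n, eps = 6, 0.1
H = paired_matrix(n, eps)
y = [-sum(row) for row in H]              # y = H * (-1)
minima = local_minima(H, y)
```

Each local minimum found this way has equal entries in positions $i$ and $i + N/2$, matching the family of vectors described in the proof.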

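Now we study how often we encounter a local minimum in
the specific integer least-square problem model. Without loss of
generality, we assume that the transmitted sequence is an all
$-1$ sequence. We first give the condition for $\mathbf{x}$ to be a local
minimum. We assume that $\mathbf{x}$ is a vector which has $k$ '+1' entries over
an index set $K$ with $|K| = k$ and $(N - k)$ '-1' entries over the set
$\bar{K} = \{1,2,\ldots,N\} \setminus K$.

Lemma VI.2. $\mathbf{x}$ is a local minimum if and only if $\mathbf{x}$ is not a
global minimum; and

$$\forall i \in K:\quad \mathbf{h}_i^T\Big(\sum_{j \in K, j \ne i} \mathbf{h}_j - \frac{\mathbf{v}}{2}\Big) < -\frac{\|\mathbf{h}_i\|^2}{2},$$

$$\forall i \in \bar{K}:\quad \mathbf{h}_i^T\Big(\sum_{j \in K} \mathbf{h}_j - \frac{\mathbf{v}}{2}\Big) > -\frac{\|\mathbf{h}_i\|^2}{2}.$$

Proof: For a position $i \in K$, when we flip $x_i$ to $-1$, the distance
$\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2$ is increased, namely,

$$\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}_{\sim i}\|^2 = \Big\|-2\sum_{j \in K}\mathbf{h}_j + \mathbf{v}\Big\|^2 - \Big\|-2\sum_{j \in K, j \ne i}\mathbf{h}_j + \mathbf{v}\Big\|^2 = 4\|\mathbf{h}_i\|^2 + 4\mathbf{h}_i^T\Big(2\sum_{j \in K, j \ne i}\mathbf{h}_j - \mathbf{v}\Big) < 0, \qquad (20)$$

where $\mathbf{x}_{\sim i}$ is a neighbor of $\mathbf{x}$ obtained by changing index $i$. This means

$$\mathbf{h}_i^T\Big(\sum_{j \in K, j \ne i} \mathbf{h}_j - \frac{\mathbf{v}}{2}\Big) < -\frac{\|\mathbf{h}_i\|^2}{2}. \qquad (21)$$

For a position $i \in \bar{K}$, when we flip $x_i$ to $+1$, the distance $\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2$ is
also increased, namely,

$$\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}_{\sim i}\|^2 = \Big\|-2\sum_{j \in K}\mathbf{h}_j + \mathbf{v}\Big\|^2 - \Big\|-2\sum_{j \in K}\mathbf{h}_j - 2\mathbf{h}_i + \mathbf{v}\Big\|^2 = -4\|\mathbf{h}_i\|^2 + 4\mathbf{h}_i^T\Big(-2\sum_{j \in K}\mathbf{h}_j + \mathbf{v}\Big) < 0. \qquad (22)$$

This means

$$\mathbf{h}_i^T\Big(\sum_{j \in K} \mathbf{h}_j - \frac{\mathbf{v}}{2}\Big) > -\frac{\|\mathbf{h}_i\|^2}{2}. \qquad (23)$$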
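
The two inequalities derived above can be compared against a direct neighbor-by-neighbor check. The Python sketch below is an illustration under stated assumptions, not part of the paper: it tests only the neighbor conditions (the separate requirement that the state is not a global minimum is left out), takes $\mathbf{y} = -\mathbf{H}\mathbf{1} + \mathbf{v}$ with the SNR absorbed into `H`, and uses hypothetical names, a fixed seed and N = 5.

```python
import itertools
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cost(H, y, x):
    return sum((y[r] - dot(H[r], x)) ** 2 for r in range(len(y)))


def neighbors_all_worse(H, y, x):
    """Direct check: every single-position flip strictly increases the cost."""
    c0 = cost(H, y, x)
    for k in range(len(x)):
        xk = list(x)
        xk[k] = -xk[k]
        if not cost(H, y, xk) > c0:
            return False
    return True


def lemma_conditions(H, v, x):
    """Inner-product conditions for i in K (x_i = +1) and i in K-bar (x_i = -1)."""
    n = len(x)
    K = [j for j in range(n) if x[j] == 1]
    # s = sum_{j in K} h_j - v / 2
    s = [sum(H[r][j] for j in K) - 0.5 * v[r] for r in range(len(v))]
    for i in range(n):
        h = [row[i] for row in H]            # i-th column of H
        hh = dot(h, h)
        if x[i] == 1:
            # the sum excludes j = i, so subtract ||h_i||^2
            if not dot(h, s) - hh < -0.5 * hh:
                return False
        elif not dot(h, s) > -0.5 * hh:
            return False
    return True


rng = random.Random(3)                       # illustrative seed
n = 5
H = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
v = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [v[r] - sum(H[r]) for r in range(n)]     # x = -1 transmitted
agree = all(lemma_conditions(H, v, x) == neighbors_all_worse(H, y, x)
            for x in itertools.product((-1, 1), repeat=n))
```

The inner-product form needs only the columns of $\mathbf{H}$ and the noise vector, which is what makes it convenient for the probabilistic analysis that follows in the paper.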



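It is not hard to see that when SNR $\to \infty$, $\mathbf{v}$ is comparatively
small with high probability, so we have the following
lemma.

Lemma VI.3. When SNR $\to \infty$, with high probability, $\mathbf{x}$ is
a local minimum if and only if $\mathbf{x} \ne -\mathbf{1}$; and

$$\forall i \in K:\quad \mathbf{h}_i^T\sum_{j \in K, j \ne i} \mathbf{h}_j < -\frac{\|\mathbf{h}_i\|^2}{2}, \qquad \forall i \in \bar{K}:\quad \mathbf{h}_i^T\sum_{j \in K} \mathbf{h}_j > -\frac{\|\mathbf{h}_i\|^2}{2}.$$

By symmetry, for i > 1,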






(25) 



Theorem VI.4. Consider a $2 \times 2$ matrix $\mathbf{H}$ whose two
columns are uniform randomly sampled from the unit-normed
2-dimensional vectors. When $\mathbf{v} = 0$, the probability of there
existing a local minimum for such an $\mathbf{H}$ is $\frac{1}{3}$.

Proof: When $\mathbf{v} = 0$, clearly $\mathbf{x} = (-1,-1)$ is a global
minimum point, not a local minimum point. It is also clear
that $\mathbf{x} = (-1,1)$ or $\mathbf{x} = (1,-1)$ cannot be a local minimum
point since they are neighbors to the global minimum solution.
So the only possible local minimum point is $\mathbf{x} = (1,1)$.

From Lemma VI.2 the corresponding necessary and sufficient
condition is

$$\mathbf{h}_1^T\mathbf{h}_2 < -\max\left\{\frac{\|\mathbf{h}_1\|^2}{2}, \frac{\|\mathbf{h}_2\|^2}{2}\right\}.$$

This means the angle $\theta$ between the two 2-dimensional vectors
$\mathbf{h}_1$ and $\mathbf{h}_2$ satisfies $\cos(\theta) < -\frac{1}{2}$. Since $\mathbf{h}_1$ and $\mathbf{h}_2$ are two
independent uniform randomly sampled vectors, the chance for
that to happen is

$$\frac{\pi - \arccos\left(-\frac{1}{2}\right)}{\pi} = \frac{1}{3}. \qquad ■$$
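
The value $\frac{1}{3}$ can be checked with a short Monte Carlo experiment. The Python sketch below is illustrative only: it tests directly whether $\mathbf{x} = (+1,+1)$ has a smaller cost than both of its neighbors for $\mathbf{v} = 0$ and $\mathbf{y} = -(\mathbf{h}_1 + \mathbf{h}_2)$, and the function names, the seed and the number of trials are assumptions.

```python
import math
import random


def has_local_minimum(h1, h2):
    """For v = 0 and y = -(h1 + h2), only x = (+1, +1) can be a local minimum."""
    c11 = 4.0 * ((h1[0] + h2[0]) ** 2 + (h1[1] + h2[1]) ** 2)   # cost of (+1, +1)
    c1m = 4.0 * (h1[0] ** 2 + h1[1] ** 2)                       # cost of (+1, -1)
    cm1 = 4.0 * (h2[0] ** 2 + h2[1] ** 2)                       # cost of (-1, +1)
    return 0.0 < c11 < min(c1m, cm1)


def estimate_unit_norm(trials, rng):
    """Fraction of random unit-norm column pairs that produce a local minimum."""
    hits = 0
    for _ in range(trials):
        a = rng.uniform(0.0, 2.0 * math.pi)
        b = rng.uniform(0.0, 2.0 * math.pi)
        hits += has_local_minimum((math.cos(a), math.sin(a)),
                                  (math.cos(b), math.sin(b)))
    return hits / trials


estimate = estimate_unit_norm(200000, random.Random(0))   # illustrative seed
```

With unit-norm columns the comparison reduces to $\|\mathbf{h}_1 + \mathbf{h}_2\|^2 < 1$, which is the condition $\cos(\theta) < -\frac{1}{2}$ used in the proof.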



Theorem VI.5. Consider a $2 \times 2$ matrix $\mathbf{H}$ whose elements
are independent $\mathcal{N}(0,1)$ Gaussian random variables. When
$\mathbf{v} = 0$, the probability of there existing a local minimum for
such an $\mathbf{H}$ is $\frac{1}{3} - \frac{1}{\sqrt{5}} + \frac{2\arctan(\sqrt{5/3})}{\sqrt{5}\pi}$.

Proof: When $\mathbf{v} = 0$, clearly $\mathbf{x} = (-1,-1)$ is a global
minimum point, not a local minimum point. It is also clear
that $\mathbf{x} = (-1,1)$ or $\mathbf{x} = (1,-1)$ cannot be a local minimum
point since they are neighbors to the global minimum solution.
So the only possible local minimum point is $\mathbf{x} = (1,1)$.

From Lemma VI.2 the corresponding necessary and sufficient
condition is

$$\mathbf{h}_1^T\mathbf{h}_2 < -\max\left\{\frac{\|\mathbf{h}_1\|^2}{2}, \frac{\|\mathbf{h}_2\|^2}{2}\right\}.$$

This means the angle $\theta$ between the two 2-dimensional vectors
$\mathbf{h}_1$ and $\mathbf{h}_2$ satisfies

$$r_1 r_2 \cos(\theta) < -\frac{\max\{r_1^2, r_2^2\}}{2},$$

where $r_1$ and $r_2$ are respectively the $\ell_2$ norms of $\mathbf{h}_1$ and $\mathbf{h}_2$.

Because the elements of $\mathbf{H}$ are independent Gaussian
random variables, $r_1$ and $r_2$ are thus independent random
variables following the Rayleigh distribution

$$p(r_1) = r_1 e^{-\frac{r_1^2}{2}}, \qquad p(r_2) = r_2 e^{-\frac{r_2^2}{2}},$$

while $\theta$ follows a uniform distribution over $[0, 2\pi)$.

Since $\theta$ is an independent random variable satisfying
$\cos(\theta) < -\frac{\max\{r_1^2, r_2^2\}}{2 r_1 r_2}$ and $\cos(\theta) \ge -1$, the probability that
$\mathbf{x} = (+1,+1)$ is a local minimum is given by

$$\int_0^{\infty}\int_{r_1/2}^{2r_1} r_1 r_2\, e^{-\frac{r_1^2 + r_2^2}{2}}\, \frac{\pi - \arccos\left(-\frac{\max\{r_1^2, r_2^2\}}{2 r_1 r_2}\right)}{\pi}\, dr_2\, dr_1 = \frac{1}{3} - \frac{1}{\sqrt{5}} + \frac{2\arctan(\sqrt{5/3})}{\sqrt{5}\pi},$$

which is approximately 0.145696. ■
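
The closed-form value can be compared with a Monte Carlo estimate over Gaussian matrices. The following Python sketch is illustrative and not part of the paper: the function names, the seed and the number of trials are assumptions, and the test for a local minimum compares the cost of $\mathbf{x} = (+1,+1)$ with the costs of its two neighbors for $\mathbf{v} = 0$.

```python
import math
import random


def closed_form():
    """1/3 - 1/sqrt(5) + 2 arctan(sqrt(5/3)) / (sqrt(5) pi)."""
    return (1.0 / 3.0 - 1.0 / math.sqrt(5.0)
            + 2.0 * math.atan(math.sqrt(5.0 / 3.0)) / (math.sqrt(5.0) * math.pi))


def has_local_minimum(h1, h2):
    """For v = 0 and y = -(h1 + h2), only x = (+1, +1) can be a local minimum."""
    c11 = 4.0 * ((h1[0] + h2[0]) ** 2 + (h1[1] + h2[1]) ** 2)   # cost of (+1, +1)
    c1m = 4.0 * (h1[0] ** 2 + h1[1] ** 2)                       # cost of (+1, -1)
    cm1 = 4.0 * (h2[0] ** 2 + h2[1] ** 2)                       # cost of (-1, +1)
    return 0.0 < c11 < min(c1m, cm1)


def estimate_gaussian(trials, rng):
    """Fraction of 2 x 2 i.i.d. N(0, 1) matrices that produce a local minimum."""
    hits = 0
    for _ in range(trials):
        h1 = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))   # first column
        h2 = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))   # second column
        hits += has_local_minimum(h1, h2)
    return hits / trials


value = closed_form()
estimate = estimate_gaussian(200000, random.Random(0))    # illustrative seed
```

Compared with the unit-norm case, the unequal column norms make the condition on the angle stricter, which is why the probability drops from $\frac{1}{3}$ to roughly 0.146.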

For higher dimension $N$, it is hard to directly estimate the
probability of a vector being a local minimum based on the
conditions in Lemma VI.2. Simulation results instead suggest
that for large $N$, with high probability, there exists at least
one local minimum. The following lemma gives us a sufficient
condition. For example, if the sum of $k$ columns has a very
small $\ell_2$ norm, that will very likely lead to a local minimum.

Lemma VI.6. $\mathbf{x}$ is a local minimum if $\mathbf{x}$ is not a global minimum and

$$\Big\|\sum_{j \in K} \mathbf{h}_j - \frac{\mathbf{v}}{2}\Big\| < \min_{1 \le i \le N} \frac{\|\mathbf{h}_i\|}{2}.$$

Proof: This follows from $\left|\mathbf{h}_i^T\left(\sum_{j \in K} \mathbf{h}_j - \frac{\mathbf{v}}{2}\right)\right| \le \|\mathbf{h}_i\| \left\|\sum_{j \in K} \mathbf{h}_j - \frac{\mathbf{v}}{2}\right\|$
and the conditions in Lemma VI.2. ■

Theorem VI.7. Consider an $N \times N$ matrix $\mathbf{H}$ whose $N$
columns are uniform randomly sampled from the unit-normed
$N$-dimensional vectors. When $\mathbf{v} = 0$, the expected number
of local minima for such an $\mathbf{H}$ satisfies $E(N_{local}) \ge \sum_{k=2}^{N} \binom{N}{k} P_k$,
where $P_k$ is the probability that the magnitude of the sum of
$k$ uniform randomly sampled vectors is less than $\frac{1}{2}$.

Proof: This follows from Lemma VI.2 and the fact that
there are $\binom{N}{k}$ vectors $\mathbf{x}$ which have exactly $k$ '+1' entries. ■

VII. Choice of Temperature $\alpha$ in High SNR

In previous sections, we have looked at the mixing time of
MCMC for an integer LS problem. Now we use the results
we have accumulated so far to help choose the appropriate
temperature $\alpha$ to ensure that the MCMC mixes fast and
that the optimal solution also comes up fast when the system
is in a stationary distribution.

When SNR $\to \infty$, the integer LS problem will have the
same local minima as the case $\mathbf{v} = 0$. From the derivations
and simulations, it is suggested that with high probability there
will be at least one local minimum in the integer LS problem,
especially for large problem dimension $N$.

So following from Theorem V.4 and the reasoning therein, to
ensure there is an upper bound on the mixing time as SNR $\to \infty$,
the temperature $\alpha$ should at least grow at a rate such that

$$\min_{\mathbf{x}'} \frac{\|\mathbf{y} - \mathbf{H}\mathbf{x}'\|^2 - \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2}{2\alpha^2} \le C,$$

where $\mathbf{x}$ is a local minimum and $\mathbf{x}'$ is a neighbor of $\mathbf{x}$, and
$C$ is a constant.

This will require that $\alpha^2$ grow as fast as $\Omega(\mathrm{SNR})$ to ensure
fast mixing with the existence of local minima. This explains
why keeping the temperature at the noise level leads
to slow convergence in the high SNR regime [12].
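
The linear growth of $\alpha^2$ with SNR can be illustrated numerically. The Python sketch below rests on several assumptions that are not taken from the paper: the requirement is read as keeping the smallest cost gap at a local minimum, divided by $2\alpha^2$, at a constant $C$; the noise is set to zero; and the 2 x 2 matrix, the state `x_loc`, the value of `C` and the function name `required_alpha_sq` are hypothetical.

```python
import math


def cost(H, y, x):
    return sum((y[r] - sum(h * xi for h, xi in zip(H[r], x))) ** 2
               for r in range(len(y)))


def required_alpha_sq(H_base, snr, x, C):
    """alpha^2 such that (smallest neighbor cost gap) / (2 alpha^2) equals C."""
    scale = math.sqrt(snr)
    H = [[scale * h for h in row] for row in H_base]   # absorb SNR into H
    y = [-sum(row) for row in H]                       # x = -1 sent, v = 0
    c0 = cost(H, y, x)
    gaps = []
    for k in range(len(x)):
        xk = list(x)
        xk[k] = -xk[k]
        gaps.append(cost(H, y, xk) - c0)
    return min(gaps) / (2.0 * C)


H_base = [[1.0, -1.1], [0.0, 0.0]]    # illustrative matrix with a local minimum
x_loc = (1, 1)
a_low = required_alpha_sq(H_base, 10.0, x_loc, C=2.0)
a_high = required_alpha_sq(H_base, 100.0, x_loc, C=2.0)
ratio = a_high / a_low
```

Because the cost gaps scale linearly with the SNR in the noise-free case, the temperature obtained this way scales linearly as well, in line with the $\Omega(\mathrm{SNR})$ requirement stated above.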

VIII. Simulation Results 

In this section we present simulation results for an $N \times N$
system with a full square channel matrix containing i.i.d.
Gaussian entries.

In Figure 1 we plot the expected number of local minima in
a system as the problem dimension N grows. For each N, we 
generate 100 random channel matrices and for each matrix, we 
examine the number of local minima by exhaustive search. As 
the problem dimension N grows, the number of local minima 
grows rapidly. 

In Figure 2 we plot the probability of there existing a local
minimum as the problem dimension $N$ grows. For each $N$,
we generate 100 random channel matrices and for each matrix,
we examine whether there exists a local minimum by exhaustive
search. As $N$ grows, the empirical probability of there existing
at least one local minimum approaches 1. It is interesting to
see that for $N = 2$, our theoretical result
$\frac{1}{3} - \frac{1}{\sqrt{5}} + \frac{2\arctan(\sqrt{5/3})}{\sqrt{5}\pi} \approx 0.15$
matches well with the simulations.

We also examine how the spectral gap for MCMC is
related to the existence of local minima. For $N = 5$ and
SNR = 10, we randomly generated 10 problem instances
and kept the temperature $\alpha^2 = 1$, the same as the noise
variance. Out of the 10 trials, the number of local minima
are 2,1,0,0,0,2,0,0,2 and 0. The corresponding spectral 
gaps are respectively 0.0037,0.0008,0.1244,0.1957,0.1989, 
0.0011, 0.1698, 0.1764, 5 x 10"^°, and 0.1266. It can be 
seen that when there exist local minima, the spectral gap is 
significantly smaller than the cases without local minima. This 
implies a slower mixing for the systems with local minima, 
which is consistent with our theoretical results. 

IX. Conclusion 

In this paper, we study the mixing time of Markov Chain
Monte Carlo (MCMC) for the integer least-square optimization
problem. It is found that the mixing time of MCMC for
the integer least-square problem depends on the structure of
the underlying lattice. More specifically, the mixing time of
MCMC is found to be closely related to whether there is a
local minimum in the lattice structure of the integer least-square
problem. For some lattices, the mixing time of the
Markov chain is independent of the signal-to-noise ratio; while
for some lattices, the mixing time is correlated with the
signal-to-noise ratio. We also derive the probability that there exist
local minima in an integer least-square problem, which can be
as high as $\frac{1}{3} - \frac{1}{\sqrt{5}} + \frac{2\arctan(\sqrt{5/3})}{\sqrt{5}\pi}$. Both theoretical and empirical
results suggest that to ensure fast mixing for the MCMC for
the integer least-square problem, the temperature for MCMC
should often grow as the signal-to-noise ratio increases.

Figure 1: Average Number of Local Minima

Figure 2: The Probability of Having Local Minima

References 

[1] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, "Closest point search in lattices," IEEE Transactions on Information Theory, vol. 48, no. 8, pp. 2201-2214, 2002.
[2] M. A. Borno, "Reduction in Solving Some Integer Least Squares Problems," Thesis, McGill University, 2011.
[3] M. O. Damen, H. E. Gamal, and G. Caire, "On Maximum-Likelihood Detection and the Search for the Closest Lattice Point," IEEE Trans. on Info. Theory, vol. 49, pp. 2389-2402, Oct. 2003.
[4] B. M. Hochwald and S. ten Brink, "Achieving near-capacity on a multiple-antenna channel," IEEE Trans. on Commun., vol. 51, no. 3, pp. 389-399, 2003.
[5] B. Hassibi and H. Vikalo, "On the Sphere-Decoding Algorithm. I. Expected Complexity," IEEE Trans. on Sig. Proc., vol. 53, pp. 2806-2818, Aug. 2005.
[6] J. Jalden and B. Ottersten, "On the Complexity of Sphere Decoding in Digital Communications," IEEE Trans. on Sig. Proc., vol. 53, pp. 1474-1484, Apr. 2005.
[7] D. Levin, Y. Peres, and E. Wilmer, Markov Chains and Mixing Times. American Mathematical Society, 2008.
[8] O. Haggstrom, Finite Markov Chains and Algorithmic Applications. Cambridge University Press, 2002.
[9] H. Zhu, B. Farhang-Boroujeny, and R. Chen, "On performance of sphere decoding and Markov chain Monte Carlo detection methods," IEEE Sig. Proc. Letters, vol. 12, pp. 669-672, 2005.
[10] X. Wang and H. V. Poor, Wireless Communication Systems: Advanced Techniques for Signal Reception. Prentice Hall, 2003.
[11] M. Hansen, B. Hassibi, A. Dimakis, and W. Xu, "Near-Optimal Detection in MIMO Systems using Gibbs Sampling," in Globecom'09, 2009.
[12] B. Farhang-Boroujeny, H. Zhu, and Z. Shi, "Markov chain Monte Carlo algorithms for CDMA and MIMO communication systems," IEEE Trans. on Sig. Proc., vol. 54, no. 5, pp. 1896-1909, 2006.
[13] R. Chen, J. S. Liu, and X. Wang, "Convergence analyses and comparisons of Markov chain Monte Carlo algorithms in digital communications," IEEE Transactions on Signal Processing, vol. 50, pp. 255-270, 2002.
[14] D. MacKay, Information Theory, Inference and Learning Algorithms. Cambridge University Press, 2003.
[15] M. Jerrum and A. Sinclair, "Approximating the Permanent," SIAM Journal on Computing, vol. 18, pp. 1149-1178, 1989.
[16] G. Lawler and A. Sokal, "Bounds on the $L^2$ spectrum for Markov Chains and Markov Processes: a Generalization of Cheeger's Inequality," Trans. Amer. Math. Soc., vol. 309, pp. 557-580, 1988.



